This research proposes a machine learning-powered system for identifying and bridging digital skill gaps among underprivileged learners, modeled as simulated personas, using public freelance job market data. The system evaluates skill-job fit with Jaccard similarity and recommends targeted learning paths to improve freelance readiness. The frontend, built in Streamlit, visualizes metrics such as fit score, skill coverage, and time-to-learn, offering an interactive and measurable framework for digital inclusion. In a simulated learner profile, the fit score rose from 0.000 to 0.083 after about 60 hours of recommended learning, suggesting scalable potential for skill development and economic upliftment.
Introduction
Freelancing offers economic opportunity, but many underprivileged individuals lack access due to gaps in digital skills and guidance. Traditional education often doesn't align with market needs. This study proposes an AI-powered skill recommendation system to help bridge that gap by:
Simulating learner personas using socio-economic data
Analyzing freelance job trends (from platforms like Upwork and Freelancer.com)
Matching skills to jobs using Jaccard similarity
Recommending high-impact, low-effort skills
Presenting results via an interactive Streamlit dashboard
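The matching step listed above can be illustrated with a minimal sketch. It assumes skills are held as sets of normalized strings. The function names (`normalize`, `jaccard_fit`, `skill_coverage`) and the sample skills are hypothetical and are not taken from the project code. Skill coverage is assumed here to mean the share of a job's required skills that the learner already has.

```python
def normalize(skills):
    """Lowercase and strip skill names so that 'Writing ' matches 'writing'."""
    return {s.strip().lower() for s in skills if s.strip()}


def jaccard_fit(learner_skills, job_skills):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    a, b = normalize(learner_skills), normalize(job_skills)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def skill_coverage(learner_skills, job_skills):
    """Assumed definition: fraction of the job's skills the learner has."""
    a, b = normalize(learner_skills), normalize(job_skills)
    return len(a & b) / len(b) if b else 0.0


# Illustrative data only (not from the datasets used in the study).
learner = ["Typing", "Writing"]
job = ["writing", "data visualization", "excel", "python"]

print(jaccard_fit(learner, job))     # 1 shared skill out of 5 distinct skills
print(skill_coverage(learner, job))  # 1 of the 4 required skills
```

Because the Jaccard score divides by the union of both sets, a learner with few skills facing a long list of required skills will score low even after learning several of them, which is why small absolute gains can still reflect real progress.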
For example, a simulated user, "Ayesha Khan", from an urban slum and with only basic skills, improved her job market fit score from 0.000 to 0.083 after completing about 60 hours of recommended learning. The recommended skills included writing and creative data presentation.
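One way to read "high-impact, low-effort" is as a ranking of missing skills by market demand per hour of learning, selected within a fixed hour budget. The sketch below is a simple greedy heuristic under that assumption, not necessarily the method used in the study. The function name `recommend_skills`, the catalog structure, and all demand and hour figures are hypothetical.

```python
def recommend_skills(learner_skills, skill_catalog, hour_budget):
    """Greedy pick: highest demand-per-hour first, while the budget allows.

    skill_catalog maps skill -> (demand, hours_to_learn); both values are
    illustrative assumptions, not figures from the study.
    """
    known = {s.strip().lower() for s in learner_skills}
    candidates = [
        (skill, demand, hours)
        for skill, (demand, hours) in skill_catalog.items()
        if skill.lower() not in known and hours > 0
    ]
    # Rank by demand gained per hour invested, best first.
    candidates.sort(key=lambda c: c[1] / c[2], reverse=True)

    plan, hours_used = [], 0
    for skill, _demand, hours in candidates:
        if hours_used + hours <= hour_budget:
            plan.append(skill)
            hours_used += hours
    return plan, hours_used


# Hypothetical catalog: skill -> (job postings mentioning it, hours to learn).
catalog = {
    "writing": (120, 20),
    "data visualization": (90, 40),
    "python": (200, 120),
    "typing": (30, 10),
}

plan, hours = recommend_skills(["Typing"], catalog, hour_budget=60)
print(plan, hours)
```

With a 60-hour budget, this heuristic skips skills that are in high demand but too costly to learn within the budget, favoring quicker wins first.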
Key takeaway:
The system shows that even users with minimal digital access can meaningfully improve their freelancing readiness through personalized, AI-driven upskilling, promoting digital inclusion and equitable access to the gig economy.
Conclusion
This research demonstrates how AI can be harnessed for social good by identifying skill gaps in underprivileged individuals and recommending personalized upskilling paths for entry into the freelancing economy. Using real job data and simulated persona profiles, the system quantified employability improvements, as seen in the case of Ayesha Khan, who, with about 60 hours of targeted learning, raised her fit score from 0.000 to 0.083 and reached 9.1% skill coverage.
The platform provides a scalable, data-driven approach to bridging the digital divide and promoting inclusive economic participation.
References
Datasets
[1] Upwork Job Postings Dataset (2024)
PromptCloudHQ. (2024). Upwork Job Postings Dataset. Kaggle. URL: https://www.kaggle.com/datasets/PromptCloudHQ/upwork-job-postings-dataset-2024 Description: Contains detailed job postings from Upwork, including job titles, required skills, categories, and descriptions, used for extracting in-demand freelance skills and market trends.
[2] Freelancer Data Analysis Jobs
Andrew Mvd. (2021). Freelancer Data Analysis Jobs. Kaggle. URL: https://www.kaggle.com/datasets/andrewmvd/freelancer-data-analysis-jobs Description: A dataset of job postings from Freelancer.com, focused on data analysis roles, used to supplement skill demand analysis.
[3] Skill and Career Recommendation Dataset
Sourav Banerjee. (2022). Skill and Career Recommendation Dataset. Kaggle. URL: https://www.kaggle.com/datasets/iamsouravbanerjee/skill-and-career-recommendation-dataset Description: Provides simulated user profiles and associated skills, used for persona generation and skill gap simulation.
[4] Lloyds Consumer Digital Index 2022
Lloyds Bank. (2022). Lloyds Consumer Digital Index 2022. URL: https://www.lloydsbank.com/assets/media/pdfs/banking_with_us/whats-happening/210421-lloyds-consumer-digital-index-2022.pdf Description: Annual report on digital skills and inclusion in the UK, used for contextualizing digital readiness among marginalized groups.
[5] ONS Unemployment by Region 2021
Office for National Statistics. (2021). Unemployment by Region of Residence (UNEM01). URL: https://www.ons.gov.uk/employmentandlabourmarket/peoplenotinwork/unemployment/datasets/unemploymentbyregionofresidenceunem01 Description: Official UK statistics on unemployment by region, used to inform persona backgrounds and socioeconomic context.
Libraries & Tools
[1] pandas (Wes McKinney, 2010): Data manipulation and analysis.
[2] NumPy (Harris et al., 2020): Numerical computing.
[3] scikit-learn (Pedregosa et al., 2011): Machine learning algorithms (TF-IDF, clustering, etc.).
[4] Streamlit: Interactive web app framework for data science.
[5] matplotlib: Data visualization.
[6] Plotly: Interactive plotting.
Key Methods/Models
[1] TF-IDF (Term Frequency-Inverse Document Frequency): For skill extraction and similarity.
[2] SBERT (Sentence-BERT): For advanced skill matching (if used).
Related Work
[1] Cedefop (2018). Insights into skill shortages and skill mismatch: Learning from Cedefop’s European skills and jobs survey.
[2] World Economic Forum. (2020). The Future of Jobs Report 2020.
[3] S. K. Dwivedi et al. (2020). A framework for skill gap analysis using text mining and machine learning.
Project Repository
[1] Jagrit0711. (2024). AI-Based Skill Gap Identification for Freelancing Readiness in Marginalized Group. GitHub. URL: https://github.com/Jagrit0711/AI-Based-Skill-Gap-Identification-for-Freelancing-Readiness-in-Marginalized-Group